1. Title of Invention

A method for avoiding excessive overhead while using a form of SSA (Static Single 
Assignment) extended to use storage locations other than local variables. 

2. Claims 

(1) A method for avoiding excessive overhead by a programmed computer while using a 
form of SSA (Static Single Assignment) extended to use storage locations other than local 
variables, comprising the step of: 

allowing a program to use a compiler representation known as SSA form on any 
memory location addressable by the program; SSA form is normally only usable on 
function local variables. 

(2) The method as claimed in claim 1, further comprising the steps of: 

inserting phi functions at any place in the function where multiple definitions of a same 
non-SSA variable may be merged, the phi-functions producing a new definition of the 
variable at a point where they are inserted; 

finding which operations may implicitly read or write complex variables that are in 
SSA form; 

adding write-back copy operations at appropriate locations to write complex variables 
that are in SSA form, the write-back copy operations writing an SSA variable back to its 
real location; 

adding read-back copy operations at appropriate locations to read possibly modified 
values back into new SSA definitions, the read-back copy operations defining a new SSA 
variable from a variable's real location; and 

replacing every non-SSA variable definition by a definition of a unique SSA-variable, 
and replacing every non-SSA variable reference by a reference to an appropriate SSA- 
variable. 
3. Detailed Description of Invention

Technical Field to which the Invention pertains 
The present invention relates to Compiler Optimization. 

Prior Art 

This technique is related to an extension of the usual formulation of Static Single 
Assignment (SSA) form. 

Briefly, 'SSA form' is an alternative representation for variables in a program, in which
any given variable is only assigned at a single location in the program. A program is
transformed into SSA form by a process called 'SSA conversion'. SSA conversion
replaces every local variable in the source program with a set of new variables, called
'SSA variables', each of which is only assigned to at a single physical location in the
program; thus, at every point where a source variable V is assigned to in the source
program, the corresponding SSA-converted program will instead assign a unique variable,
V1, V2, etc. At any point in the program (always at the start of a basic-block) where
the merging of control flow would cause two such derived variables to be live
simultaneously, their values are merged together to yield a single new SSA variable,
e.g., V3, that represents the value of the original source variable at that point. This
merging is done using a 'phi-function'. A phi-function is an instruction which has as
many inputs as there are basic-blocks that can transfer control to the basic-block it is in,
and chooses whichever input corresponds to the basic-block that preceded the current one
in the dynamic control flow of the program.
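
The selection rule of a phi-function described above can be sketched in C. This is only an illustration of the semantics, not part of the described compiler: the type PhiInput and the function phi_select are hypothetical names, and blocks are assumed to be identified by small integers.

```c
#include <assert.h>

/* One input of a phi-function: the value to use when control
   arrives from the predecessor block numbered pred_block.
   (Hypothetical type, for illustration only.) */
typedef struct {
    int pred_block;
    int value;
} PhiInput;

/* Return the input that corresponds to the block that actually
   preceded the current one in the dynamic control flow. */
int phi_select(const PhiInput *inputs, int n_inputs, int came_from)
{
    for (int i = 0; i < n_inputs; i++) {
        if (inputs[i].pred_block == came_from)
            return inputs[i].value;
    }
    /* A well-formed phi-function has one input per predecessor. */
    assert(0 && "control arrived from a block with no phi input");
    return 0;
}
```

A real compiler never executes a phi-function this way; it is removed again when the program is converted out of SSA form. The sketch only shows which input is meant for a given incoming edge.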

SSA form is convenient because it allows variables to be treated as values, independent of
their location in the program, making many transformations more straightforward, as
they don't need to worry about the implicit constraints imposed by using single variable
names to represent multiple values, depending on the location in the program. These
properties make it a very useful representation for an optimizing compiler: many
optimization techniques become significantly simpler if the program is in SSA form.
For instance, in a traditional compiler, a simple common-sub-expression-elimination
algorithm that operates on variables must carefully guard against the possibility of
redefinition of variables, so that it generally is only practical to use within a single basic
block. However, if the program is in SSA form, this simple optimization need not
worry about redefinition at all (variables can't be redefined) and furthermore will work
even across basic block boundaries.

SSA conversion, as described above, is a transformation that is traditionally applied only 
to a function's local variables; this makes the process much easier, as local variables are 
subject to various constraints. For instance one knows that local variables are not 
aliased to other local variables, and unless its address has been taken, that a local variable 
will not be modified by a function other than the one it is declared in. 

However, there are many cases where 'active values', which one would like to receive
the benefits of the optimizations made possible by SSA form, exist in storage locations
other than local variables, for example in the fields of an object. In such a case, one
would like the object's fields to receive the same treatment as local variables, so that
the same optimizations can be applied to them.

Information about SSA form can be found in the paper: 

[SSAFORM] 'Efficiently Computing Static Single Assignment Form and the
Control Dependence Graph', by Ron Cytron et al., ACM TOPLAS, Vol. 13, No. 4,
October 1991, Pages 451-490.

The SSA-conversion process in [SSAFORM] is performed in two steps [figure 12]:
(a) [201] Phi functions are inserted at any place in the function where multiple
definitions of the same non-SSA variable may be merged. The phi-functions produce a
new definition of the variable at the point where they are inserted.

[Document numbers: 2001-3007556, 2000-119594]

Because of this step, there is only one extant definition of a source variable at any point in 
the program. 

(b) [202] Every non-SSA variable definition is replaced by a definition of a unique
SSA-variable, and every non-SSA variable reference is replaced by a reference to an
appropriate SSA-variable; because of the insertion of phi-functions, there will always be
a single extant SSA-variable corresponding to a given non-SSA variable.

An extension of SSA form to non-local locations is described in: 

[SSAMEM] 'Effective Representation of Aliases and Indirect Memory Operations in
SSA Form', by Fred Chow et al., Lecture Notes in Computer Science, Vol. 1060, April
1996, Pages 253-267.

The concept of basic-block 'dominance' is well known, and can be described as follows:

A basic block A 'dominates' a basic block B if the flow of control can reach B only after
A (although perhaps not immediately; other basic blocks may be executed between
them).

If A dominates B, and no other block dominates B without also dominating A, then A is
said to be B's 'immediate dominator'.
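
The dominance relation defined above can be computed with the standard iterative data-flow method. The following C sketch is one possible way to do so and is not taken from this description: the names FlowGraph, add_edge, compute_dominators and immediate_dominator are hypothetical, blocks are numbered from 0, at most 32 blocks are supported, and every block is assumed to be reachable from the entry block.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* preds[b] has bit p set when there is a flow-graph edge p -> b. */
typedef struct {
    int n_blocks;
    uint32_t preds[MAX_BLOCKS];
} FlowGraph;

void add_edge(FlowGraph *g, int from, int to)
{
    g->preds[to] |= (uint32_t)1 << from;
}

/* On return, dom[b] has bit a set exactly when block a dominates
   block b (every block dominates itself). */
void compute_dominators(const FlowGraph *g, int entry, uint32_t dom[])
{
    int n = g->n_blocks;
    assert(n > 0 && n <= MAX_BLOCKS);
    uint32_t all = (n == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << n) - 1u);

    for (int b = 0; b < n; b++)
        dom[b] = all;                   /* optimistic starting point */
    dom[entry] = (uint32_t)1 << entry;

    int changed = 1;
    while (changed) {                   /* iterate to a fixed point */
        changed = 0;
        for (int b = 0; b < n; b++) {
            if (b == entry)
                continue;
            uint32_t meet = all;
            for (int p = 0; p < n; p++)
                if (g->preds[b] & ((uint32_t)1 << p))
                    meet &= dom[p];     /* dominators common to all preds */
            uint32_t next = meet | ((uint32_t)1 << b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = 1;
            }
        }
    }
}

/* The immediate dominator of b is the strict dominator of b whose own
   dominator set equals the set of all strict dominators of b.
   Returns -1 for the entry block. */
int immediate_dominator(const uint32_t dom[], int n_blocks, int b)
{
    uint32_t strict = dom[b] & ~((uint32_t)1 << b);
    for (int d = 0; d < n_blocks; d++)
        if ((strict & ((uint32_t)1 << d)) && dom[d] == strict)
            return d;
    return -1;
}
```

The immediate-dominator relation forms a tree rooted at the entry block; a procedure that recurses over "blocks immediately dominated by BLOCK" walks this tree.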

Problem to be solved by the Invention
It is desirable to extend the use of SSA form to handle non-local memory locations.
However, a straightforward implementation given the prior art, which synchronizes SSA
representations at every point of unknown behavior, can be very inefficient, because there
are many operations that may read or write almost *any* memory location (for instance,
in the case of library function calls, where the compiler often has no information about
their behavior). Using such a simple technique also causes many extra phi-functions to
be introduced, which can dramatically increase the cost of using SSA form.

This invention attempts to use SSA form on non-local memory locations, without 
excessive overhead for common program structures, by consolidating memory 
synchronization operations where possible. 

Means for solving Problem 
In this invention, we modify the procedure of [SSAFORM] [figure 12] as
follows:

Method for representing pointer variables in SSA form [452]:

+ References or definitions of memory locations resulting from pointer-dereferences are
also treated as 'variables', here called 'complex variables' [452], in addition to simple
variables [451], such as those used in the source program. Complex variables consist of
a pointer variable and an offset from the pointer. An example of a complex variable is
the C source expression (lvalue) '*p', as used in [810] and [820].

Method for adding appropriate copy operations to synchronize complex variables [452]
with the memory location they represent [figure 1]:

+ These 'complex variables' [452] are treated as non-SSA variables during
SSA-conversion [figure 1] (any variable reference within a complex variable is treated as a
reference in the instruction [440] that contains the complex variable [452]).
+ A new step [120] is inserted in the SSA-conversion process [figure 1] between steps (a)
[110] and (b) [130], to take care of any necessary synchronization of SSA-converted
complex variables [452] with any instructions [440] that have unknown side-effects:

(a') [121] To any instruction [440] that may have unknown side-effects on an 'active'
complex variable [452] (one that is defined by some dominator of the instruction), add
a list of the variable and the possible side effects (may_read, may_write).

[122, 123] Next, insert special copy operations, called write-backs [521] (which write an
SSA variable back to its real location) and read-backs [522] (which define a new SSA
variable from a variable's real location), to make sure the SSA-converted versions of
affected variables [450] are correctly synchronized with respect to such side-effects. This
step may also insert new phi-functions, in the case where copying back a complex variable
[452] from its synchronization location may define a new SSA version of that variable.

For an example of adding write-backs [521] and read-backs [522], see [figure 4].

Mode for carrying out the Invention 
This invention is an addition to a compiler for a computer programming language, whose 
basic control flow is illustrated in [figure 2]: 

A source program [301] is converted into an internal representation by a parser [310], and 
if optimization is enabled, the internal representation is optimized by the optimizer [320]. 
Finally, the internal form is converted into the final object code [302] by the backend 
[330]. In a compiler that uses SSA form, the optimizer usually contains at least three
steps: conversion of the program from the 'pre-SSA' internal representation into an
internal representation that uses SSA form [figure 1], optimization of the program in SSA
form [322], and conversion of the program from SSA form to an internal representation
without SSA form [323]. Usually SSA form differs from the non-SSA internal
representation only in the presence of additional operations, and certain constraints on the
representation; see [SSAFORM] for details.

The preferred internal representation of a program used is as follows [figure 3]: 
A program [410] is a set of functions. 

A function [420] is a set of 'blocks' [430], roughly corresponding to the common
compiler concept of a 'basic block'. A flow graph is a graph where the vertices are
blocks [430], and the edges are possible transfers of control-flow between blocks [430].
A single block [430] is distinguished as the 'entry block' [421], which is the block in the
function executed first when the function is called.

Within a block [430] is a sequence of 'instructions' [440], each of which describes a
simple operation. Within a block [430], control flow moves between instructions [440]
in the same order as their sequence in the block; conditional changes in control flow may
only happen by choosing which edge to follow when choosing the successor block [432]
to a block, so if the first instruction [440] in a block is executed, the others are as well, in
the same sequence that they occur in the block [430].

An instruction [440] may be a function call, in which case it can have arbitrary side- 
effects, but control-flow must eventually return to the instruction [440] following the 
function call. 

An instruction [440] may explicitly read or write 'variables' [450], each of which is
either a 'simple variable' [451], such as a local or global variable in the source program
(or a temporary variable created by the compiler), or a 'complex variable' [452], which
represents a memory location that is indirectly referenced through another variable.
Each variable has a type, which defines what values may be stored in the variable.

Complex variables [452] are of the form '*(BASE + OFFSET)', where BASE [453] is a
variable [450], and OFFSET [454] is a constant offset; this notation represents the value
stored at memory location (BASE + OFFSET).
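
One way to picture the variable representation described above is the following C sketch. The type and function names (Variable, VarKind, make_simple, make_complex) are hypothetical and chosen only for illustration; the description does not prescribe a concrete data layout, and variable types are omitted here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { VAR_SIMPLE, VAR_COMPLEX } VarKind;

/* A variable [450]: either a simple variable [451] identified by name,
   or a complex variable [452] of the form *(BASE + OFFSET). */
typedef struct Variable {
    VarKind kind;
    const char *name;             /* simple variables only */
    const struct Variable *base;  /* complex variables: BASE [453] */
    ptrdiff_t offset;             /* complex variables: OFFSET [454] */
} Variable;

Variable make_simple(const char *name)
{
    Variable v = { VAR_SIMPLE, name, NULL, 0 };
    return v;
}

Variable make_complex(const Variable *base, ptrdiff_t offset)
{
    assert(base != NULL);         /* BASE must itself be a variable */
    Variable v = { VAR_COMPLEX, NULL, base, offset };
    return v;
}
```

Under these assumptions the C lvalue '*p' would be built as make_complex(&p_var, 0), where p_var is the simple variable for p.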

Because of the use of complex variables [452], there are typically no instructions [440] 
that serve to store or retrieve values from a computed memory location. Instead, a 
simple copy where either the source or destination, or both, is a complex variable [452] is 
used. Similarly, any other instruction [440] may store or retrieve its results and 
operands from memory using complex variables. 

To assist in program optimization, each function is converted to SSA-form, which is
described in the (Prior Art) section, as modified for this invention in the (Means for
solving Problem) section. This conversion is called SSA-conversion, and takes place in
3 steps [figure 1], (a), (a'), and (b):

(a) [110] Phi functions are inserted at any place in the function where multiple 
definitions of the same variable may be merged, as described in [SSAFORM]. The 
phi-functions produce a new definition of the variable at the point where they are inserted. 
For example, the phi function [910] is inserted to merge the different values written to the
complex variable '*p' at [911] ([820] in the input program) and [912] ([830] in the input
program); similarly, a phi function is inserted at [1010], merging the values defined at
[1011] ([820] in the input program) and [830] in the input program.

Because of this step, there is only one extant definition of a source variable at any point in 
the program. 

(a') I. [121] For each operation, determine which 'active' complex variables [452] it may
have unknown side-effects on, and attach a note to the operation with this information.
These notes are referred to below as 'variable syncs'. In the example program,
instructions [1020], [1021], [1022], and [1023] may possibly read or modify '*p' (as we
don't have any information about them).

II. [122] At the same time, add any necessary write-back copy operations [521] to write
back any complex variables [452] to their 'synchronization location', which is the
original non-SSA variable (which, for complex variables [452], is a memory location),
and mark the destination of the copy operation as such (this prevents step (b) of SSA
conversion from treating the destination of the copy as a new SSA definition). Any such
'write-back' [521] makes the associated variable inactive, and so prevents any further
write-backs [521] unless the variable is once again defined.

III. [123] Add necessary read-backs, to supply new SSA definitions of complex variables 
[452] that have been invalidated (after having been written back to their synchronization 
location). 

This is done by essentially solving a data-flow problem, where the values are 'active
read-backs', which are:

+ Defined by operations that may modify a complex variable [452], as located in step I
above, or by the merging of multiple active read-backs of the same variable [450], at
control-flow merge points. In the example, all the function calls may possibly modify
'*p', so they must be represented by read-backs at [1020], [1021], [1022], and [1024].

+ Referenced by operations that use the value of a complex variable with an active
read-back, or by reaching a control-flow merge point at which no other read-backs of that
variable are active (because such escaped definitions must then be merged with any other
values of the complex variable using a phi-function).
Only read-backs that are referenced must actually be instantiated. In the example
program, the only instantiated read-back is at [1030]. The reference that causes
instantiation is the assignment of '*p' to the variable 'x', at [840] in the source program;
in the SSA-converted program, this assignment is split between the read-back at [1030]
and the phi function at [1031].

+ Killed by definitions of the associated complex variable [452], or by a new read-back
of the variable. In the example, the read-back defined at [1021] is killed because the
following function call defines a new read-back of the same variable at [1022].

+ Merged, at control-flow merge points, with other active read-backs of the same
variable [450], resulting in a new active read-back of the same variable. In the example,
a 'merge read-back' is defined at [1030], merging the read-backs of '*p' at [1022] and
[1023].

After a fixed-point of read-back definitions is reached, those that are referenced are
instantiated by inserting the appropriate copy operation at the place where they are
defined, to copy the value from the read-back variable [450]'s synchronization location
into a new SSA variable; if necessary, new phi-functions may be inserted to reflect this
new definition point. As mentioned above, in the example this only happens at [1030].

Steps (a'.I) [121] and (a'.II) [122] take place as follows:

Call the procedure 'add_syncs_and_write_backs' [figure 5] on the function's entry block
[430], initializing the ACTIVE_VARIABLES and ALL_ACTIVE_VARIABLES
parameters to empty lists.
The procedure 'add_syncs_and_write_backs', with arguments BLOCK,
ACTIVE_VARIABLES, and ALL_ACTIVE_VARIABLES, is defined as follows [figure
5]:

[610] For every instruction [440], INSTRUCTION, in BLOCK, do:

[620] For each VARIABLE in ALL_ACTIVE_VARIABLES, do:

[621] If INSTRUCTION may possibly read or write VARIABLE, then [622] add a
'variable sync' describing the possible reference or modification to INSTRUCTION.

[625] If INSTRUCTION may possibly read or write VARIABLE, and VARIABLE is also in
ACTIVE_VARIABLES, then [626] add a 'write-back' copy operation just before
INSTRUCTION to write VARIABLE back to its synchronization location, and [627]
remove VARIABLE from ACTIVE_VARIABLES. Because only source variables are
present at this stage of SSA conversion (not SSA variables), this write-back
copy operation is represented by a copy from VARIABLE to itself ('VARIABLE :=
VARIABLE') with a special flag set to indicate that the destination should not be
SSA-converted.

[630] For each VARIABLE which is defined in INSTRUCTION, do:

Add VARIABLE to ACTIVE_VARIABLES and ALL_ACTIVE_VARIABLES
(modifications to these lists are local to the current invocation of the procedure).

[650] For each block [430], DOM, immediately dominated by BLOCK, do:

[651] Recursively use add_syncs_and_write_backs on the dominated block DOM, with
the local values of ACTIVE_VARIABLES and ALL_ACTIVE_VARIABLES passed as
the respectively named parameters.
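
The walk over the dominator tree described in steps [610] to [651] can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the implementation of this invention: complex variables are represented as bits in a small bit set, the compiler's knowledge of which variables an instruction may read or write with unknown effect is given as a precomputed may_touch mask, and the results are recorded as masks on each instruction instead of inserting real copy operations. The names VarSet, Insn, Block and their fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSNS 8
#define MAX_KIDS 4

typedef uint32_t VarSet;   /* bit i set = complex variable number i */

typedef struct {
    VarSet defs;           /* variables explicitly defined here */
    VarSet may_touch;      /* variables possibly read or written with
                              unknown effect (e.g. by a function call) */
    VarSet syncs;          /* output: 'variable syncs' [622] */
    VarSet write_backs;    /* output: write-backs placed just before
                              this instruction [626] */
} Insn;

typedef struct Block {
    Insn insns[MAX_INSNS];
    int n_insns;
    struct Block *dom_kids[MAX_KIDS];  /* immediately dominated blocks */
    int n_kids;
} Block;

/* active and all_active are passed by value, so changes made while
   processing one block are visible only to the blocks it dominates. */
void add_syncs_and_write_backs(Block *b, VarSet active, VarSet all_active)
{
    assert(b != NULL);
    for (int i = 0; i < b->n_insns; i++) {            /* [610] */
        Insn *in = &b->insns[i];
        VarSet touched = in->may_touch & all_active;  /* [620], [621] */
        in->syncs = touched;                          /* [622] */
        in->write_backs = touched & active;           /* [625], [626] */
        active &= ~in->write_backs;                   /* [627] */
        active |= in->defs;                           /* [630] */
        all_active |= in->defs;
    }
    for (int k = 0; k < b->n_kids; k++)               /* [650] */
        add_syncs_and_write_backs(b->dom_kids[k],     /* [651] */
                                  active, all_active);
}
```

Calling add_syncs_and_write_backs(entry, 0, 0) corresponds to starting with both lists empty. Because a write-back removes the variable from the active set, a run of consecutive calls that might touch the same variable receives only one write-back, before the first call.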

Step (a'.III) takes place as follows [figure 6]:

[701] Initialize the mappings BLOCK_BEGIN_READ_BACKS and
BLOCK_END_READ_BACKS to be empty. These mappings associate each block in
the flow graph with a set of read-backs.

[702] Initialize the queue PENDING_BLOCKS to the function's entry block.

[710] While PENDING_BLOCKS is not empty, [711] remove the first block [430] from
it, and invoke the function 'propagate_block_read_backs' [800] on that block.

[720] For each read-back RB in any block [430] that has been marked as 'used', and
[721] isn't a 'merge read-back' whose sources (the read-backs that it merges) are all also
marked 'used', instantiate that read-back as follows:

[730] If RB is a 'merge read-back', then the point of read-back is [741] the beginning of
the block [430] where the merge occurs; otherwise it is [742] immediately after the
instruction [440] that created the read-back.

[731] Add a copy operation at the point of read-back that copies RB's variable from its
synchronization location to an SSA variable (as noted above for adding write-back copy
operations, because at this stage no SSA variables have actually been introduced, this copy
operation simply copies from the variable to itself, but marks the source of the copy with
a flag saying not to do SSA conversion).

[732] If necessary, introduce phi functions to merge the newly defined SSA variable with
other definitions of the variable.
The function 'propagate_block_read_backs', with the parameter BLOCK, is defined as
follows [figure 7]:

[801] Look up BLOCK in BLOCK_BEGIN_READ_BACKS and
BLOCK_END_READ_BACKS, assigning the associated read-back sets to the local
variables OLD_BEGIN_READ_BACKS and OLD_END_READ_BACKS respectively.
If there is no entry for BLOCK in either case, add an appropriate empty entry for BLOCK.

[810] Calculate the intersection of the end read-back sets for each predecessor block [431]
of BLOCK in the flow-graph, calling the result NEW_BEGIN_READ_BACKS. The
intersection is calculated as follows:

Any predecessor read-back for which a read-back of the same variable doesn't exist in
one of the other predecessor blocks is discarded from the result; it is also marked as
'referenced'.

If the read-back for a given variable is the same read-back in all predecessor blocks [431],
that read-back is added to the result.

If a given variable is represented by different read-backs in at least two predecessor
blocks [431], a 'merge read-back' is created that references all the corresponding
predecessor read-backs, and this merge read-back is added to the result.

[820] If NEW_BEGIN_READ_BACKS is different from
OLD_BEGIN_READ_BACKS, or this is the first time this block has been processed,
then:

[821] Add NEW_BEGIN_READ_BACKS as the entry for BLOCK in
BLOCK_BEGIN_READ_BACKS, replacing OLD_BEGIN_READ_BACKS.

[822] Initialize NEW_END_READ_BACKS from NEW_BEGIN_READ_BACKS.

[830] For each operation INSTRUCTION in BLOCK, do:

[840] For each variable reference VREF in INSTRUCTION, do:

[845] If VREF has an entry RB in NEW_END_READ_BACKS, then [846] mark RB as
used, and [847] remove it from NEW_END_READ_BACKS.

[850] For each variable definition VDEF in INSTRUCTION, do:

[855] If VDEF has an entry RB in NEW_END_READ_BACKS, then [856] remove RB
from NEW_END_READ_BACKS.

[860] For each variable sync in INSTRUCTION that notes a variable VARIABLE as
possibly written, do:

[865] Add a new read-back entry for VARIABLE to NEW_END_READ_BACKS,
replacing any existing read-back of VARIABLE.

[870] If NEW_END_READ_BACKS is different from OLD_END_READ_BACKS,
then:

[871] Add NEW_END_READ_BACKS as the entry for BLOCK in
BLOCK_END_READ_BACKS, replacing OLD_END_READ_BACKS.

[880] Add each of BLOCK's successors [432] to PENDING_BLOCKS.
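
The three cases of the intersection in step [810] can be sketched in C for a single variable. This is a simplified illustration, not the procedure of figure 7 itself: read-backs are represented by small integer identifiers, the names ReadBackTable, new_read_back, intersect_read_backs and NO_READ_BACK are hypothetical, the sources of a merge read-back are not recorded, and no attempt is made to reuse an existing merge read-back when a block is processed again during the fixed-point iteration.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_READ_BACK (-1)
#define MAX_READ_BACKS 32

/* Bookkeeping for the read-backs of one variable. */
typedef struct {
    bool used[MAX_READ_BACKS];      /* marked 'used' / 'referenced' */
    bool is_merge[MAX_READ_BACKS];  /* created at a merge point */
    int count;
} ReadBackTable;

int new_read_back(ReadBackTable *t, bool is_merge)
{
    assert(t->count < MAX_READ_BACKS);
    t->used[t->count] = false;
    t->is_merge[t->count] = is_merge;
    return t->count++;
}

/* pred_end[i] is the read-back active at the end of predecessor i,
   or NO_READ_BACK. Returns the read-back active at the start of the
   block, or NO_READ_BACK. */
int intersect_read_backs(ReadBackTable *t, const int *pred_end, int n_preds)
{
    assert(n_preds > 0);
    bool missing = false, all_same = true;
    for (int i = 0; i < n_preds; i++) {
        if (pred_end[i] == NO_READ_BACK)
            missing = true;
        else if (pred_end[i] != pred_end[0])
            all_same = false;
    }
    if (missing) {
        /* Some predecessor has no read-back: discard the others from
           the result and mark them as referenced. */
        for (int i = 0; i < n_preds; i++)
            if (pred_end[i] != NO_READ_BACK)
                t->used[pred_end[i]] = true;
        return NO_READ_BACK;
    }
    if (all_same)
        return pred_end[0];          /* same read-back on every edge */
    return new_read_back(t, true);   /* differing read-backs: merge */
}
```

In the example program, block4 would receive a merge read-back in this way, since its two predecessors end with different read-backs of '*p'.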
(b) [130] Every non-SSA variable definition is replaced by a definition of a unique
SSA-variable, and every non-SSA variable reference is replaced by a reference to an
appropriate SSA-variable, as described in [SSAFORM].

The exception to this rule is complex variables [452] that have been marked as special
'synchronization' locations in the copy instructions [440] inserted in step (a'); they are
left as-is, referring to the original complex variable [452].

An example of a program being transformed into SSA form, with and without the use of 
this invention, can be found in figures 8-11. 

Effect of the Invention 
This invention adds synchronization operations that allow the efficient use of SSA-form
for non-local memory locations in the presence of possible aliasing.

4. Brief Description of Drawings 

Figure 1 shows the general form of the SSA-conversion process used by this invention.
Figure 2 shows overall compiler control flow.
Figure 3 shows basic data structures used in describing this invention.
Figure 4 shows placement of variable read- and write-backs.
Figure 5 shows the control flow of the procedure for steps (a'.I) and (a'.II) of the
modified SSA conversion process, adding variable synchronization information to
instructions and adding variable write-backs to a function, 'add_syncs_and_write_backs'.
Figure 6 shows the control flow of the procedure for step (a'.III) of the modified
SSA conversion process, adding variable read-backs to a function.
Figure 7 shows the control flow for a subroutine used by step (a'.III) of the modified SSA
conversion process, 'add_merged_read_backs'.
Figure 8 shows example source program.
Figure 9 shows SSA converted program, with simple implementation of read-backs.
Figure 10 shows SSA converted program, with the implementation of read-backs
described in this patent.
Figure 11 shows register-allocated and SSA-unconverted program.
Figure 12 shows general form of the traditional SSA-conversion process.

Description of the Reference Numerals

301 Input source file
302 Output file
320 Optimizer
410 Program
420 Function
421 Entry block
430 Block
440 Instruction
450 Variable
Figure 1. Modified SSA-conversion process

Figure 2. Overall compiler

Figure 4. Placement of read/write-backs for the SSA form of *x, (*x)'

Figure 5. The procedure 'add_syncs_and_write_backs'

[Flowchart. Recoverable box text, by reference numeral:
[610] For every instruction in BLOCK, INSTRUCTION, do:
[620] Loop box (text not recoverable).
[621] Decision box (text not recoverable).
[622] Add a variable synchronization entry to INSTRUCTION, describing the possible
modification.
[625] Decision box (text not recoverable).
[626] Add a copy operation before INSTRUCTION that copies VARIABLE to VARIABLE, and
has a flag to prevent the destination from being SSA-converted.
[627] Remove VARIABLE from ACTIVE_VARS.
[630] Add every variable defined in INSTRUCTION to both ACTIVE_VARS and
ALL_ACTIVE_VARS.
[650] For each block dominated by BLOCK, DOM, do:
[651] Recursively call add_syncs_and_write_backs on DOM, passing copies of
ACTIVE_VARS and ALL_ACTIVE_VARS.]
Figure 6. Conversion step (a'.III), insertion of read-backs

[Flowchart. Inputs: a function. Recoverable box text, by reference numeral:
[701] Initialize BLOCK_BEGIN_READ_BACKS and BLOCK_END_READ_BACKS to empty
mappings.
[702] Initialize the queue PENDING_BLOCKS to contain only the function's entry block.
[710] While PENDING_BLOCKS is not empty, do:
[711] Invoke the procedure 'propagate_block_read_backs' on the head of the queue,
after removing it from the queue.
[720] For each read-back, RB, that has been marked as 'used', do:
[721], [730] Decision boxes (text not recoverable).
[741] Define the 'read-back-point' as being the beginning of the block that RB
represents the merge of.
[742] Define the 'read-back-point' as being just after the operation that caused RB to
be created.
[731] Add a copy operation at the read-back-point that copies RB's variable to itself,
but also marks the source of the copy so that it doesn't have SSA variable conversion
performed on it.
[732] If any new phi-functions are necessary because of a new definition at the
read-back-point, add them.]
Figure 7. The procedure 'propagate_read_backs'

[Flowchart. Inputs: BLOCK, ACTIVE_VARS, ALL_ACTIVE_VARS. Recoverable box text, by
reference numeral:
[801] Look up BLOCK in BLOCK_BEGIN_READ_BACKS and BLOCK_END_READ_BACKS,
assigning the results to OLD_BEGIN_READ_BACKS and OLD_END_READ_BACKS
respectively, and using an empty set where BLOCK has no entry.
[810] Define NEW_BEGIN_READ_BACKS to be the intersection of the
BLOCK_END_READ_BACKS value of all of BLOCK's predecessor blocks. Where two or
more different read-backs for the same variable are present, a 'merge read-back' is
created to combine them, starting at BLOCK.
[820] Decision box (text not recoverable).
[821] Make NEW_BEGIN_READ_BACKS the entry for BLOCK in
BLOCK_BEGIN_READ_BACKS.
[822] Define a new read-back set, NEW_END_READ_BACKS, initialized from
NEW_BEGIN_READ_BACKS.
[830] For each operation, INSTRUCTION, in BLOCK, do:
[840] For each variable reference, VREF, in INSTRUCTION, do:
[845] Decision box (text not recoverable).
[846] Mark that read-back as 'used'.
[847] Remove the read-back from NEW_END_READ_BACKS.
[856] Remove the read-back from NEW_END_READ_BACKS.
[860] For each variable sync in INSTRUCTION that notes a variable, VARIABLE, as
possibly written, do:
[865] Add a new read-back entry for VARIABLE to NEW_END_READ_BACKS, replacing
any existing entry for VARIABLE.
[871] Make NEW_END_READ_BACKS the entry for BLOCK in BLOCK_END_READ_BACKS.
[880] For each successor, SUCC, in BLOCK's successor list, do: add SUCC to
PENDING_BLOCKS.]
Figure 8: Example source program

This short C program is used to illustrate the invention:

extern int g (), h (), i (), x;

int foo (int *p)
{
  (*p)++;                       /* [810] */
  if (*p > 10)
    {
      g ();
      h ();
      if (x > 5)
        g ();
      if (x > 3)
        i ();
      else
        x = *p;
      *p = 5;
    }
  return *p;
}

Here's the same program converted to a slightly more primitive form:

int foo (int *p)
{
block1:
  *p := *p + 1;                 /* [820] */
  if (*p <= 10)
    goto block8;

block2:
  g ();
  h ();
  if (x <= 5)
    goto block4;

block3:
  g ();

block4:
  if (x > 3)
    goto block6;

block5:
  x := *p;                      /* [840] */
  goto block7;

block6:
  i ();

block7:
  *p := 5;                      /* [830] */

block8:
  return *p;
}
Figure 9: SSA converted program, with simple implementation of read-backs:

The following is pseudo-C, augmented with the 'phi' operation, where

  RESULT = phi (block1: VAL1, ..., blockN: VALN)

means 'assign VAL1 to RESULT if control-flow comes from block1', and
similarly so on for each value of N.

The extra variables "pvN", where N is an integer, are SSA versions
of *p, and are in fact local variables, not dereferences of p.

int foo (int *p)
{
  int pv1, pv2, pv3, pv4, pv5, pv6;

block1:
  pv1 = *p + 1;
  if (pv1 <= 10)
    goto block8;

block2:
  *p = pv1;     /* This writes-back PV1 to *P. */
  g ();
  pv2 = *p;     /* This reads-back *P into PV2. */
  *p = pv2;     /* This writes-back PV2 to *P. */
  h ();
  pv3 = *p;     /* This reads-back *P into PV3. */      /* [912] */
  if (x <= 5)
    goto block4;

block3:
  *p = pv3;     /* This writes-back PV3 to *P. */
  g ();
  pv4 = *p;     /* This reads-back *P into PV4. */      /* [911] */

block4:
  pv5 = phi (block3: pv4, block2: pv3);                 /* [910] */
  if (x > 3)
    goto block6;

block5:
  goto block7;

block6:
  i ();

block7:
  x = phi (block6: x, block5: pv5);

block8:
  pv6 = phi (block1: pv1, block7: 5);
  *p = pv6;     /* This writes-back PV6 to *P. */
  return pv6;
}
Figure 10: SSA converted program, with the implementation of read-backs
described in this patent

int foo (int *p)
{
  int pv1, pv2, pv3;

block1:
  pv1 = *p + 1;                                         /* [1011] */
  if (pv1 <= 10)
    goto block8;

block2:
  *p = pv1;     /* This writes-back PV1 to *P. */
  g ();                                                 /* [1021] */
  h ();                                                 /* [1022] */
  if (x <= 5)
    goto block4;

block3:
  g ();                                                 /* [1023] */

block4:
  pv2 = *p;     /* This reads-back *P into PV2. */      /* [1030] */
  if (x > 3)
    goto block6;

block5:
  goto block7;

block6:
  i ();                                                 /* [1024] */

block7:
  x = phi (block6: x, block5: pv2);                     /* [1031] */

block8:
  pv3 = phi (block1: pv1, block7: 5);                   /* [1010] */
  *p = pv3;     /* This writes-back PV3 to *P. */
  return pv3;
}
Figure 11: Register-allocated and SSA-unconverted program 

Using SSA-form requires a good register allocator that merges 
variables where possible, because SSA conversion tends to generate a 
lot of variables with short lifetimes. We assume such an allocator here. 

int foo (int *p) 
{ 

  int pv; 

block1: 
  pv = *p + 1; 
  if (pv <= 10) 
    goto block8; 

block2: 
  *p = pv; /* This writes-back PV to *P. */ 
  g (); 
  h (); 
  if (x <= 5) 
    goto block4; 

block3: 
  g (); 

block4: 
  if (x > 3) 
    goto block6; 

block5: 
  x = *p; 
  goto block7; 

block6: 
  i (); 

block7: 
  pv = 5; 

block8: 
  *p = pv; /* This writes-back PV to *P. */ 
  return pv; 

} 
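
The claim implicit in Figure 11, that the register-allocated program behaves like the original one even when a called function modifies *p, can be checked mechanically. The following C sketch is offered under stated assumptions and is not part of the disclosure: the bodies of g, h and i, the pointer alias, and the helper same_behaviour are hypothetical, chosen so that g () changes *p behind the caller's back.

```c
#include <assert.h>

/* Hypothetical stubs: the example program only declares these as extern. */
static int x;
static int *alias;                 /* location that g () modifies, if set */
static void g (void) { if (alias) *alias += 100; }
static void h (void) { }
static void i (void) { }

/* The original program.  */
static int
foo_orig (int *p)
{
  (*p)++;
  if (*p > 10)
    {
      g ();
      h ();
      if (x > 5)
        g ();
      if (x > 3)
        i ();
      else
        x = *p;
      *p = 5;
    }
  return *p;
}

/* The program of Figure 11: *p is kept in the local variable pv,
   written back before the calls and read back only where it is used.  */
static int
foo_fig11 (int *p)
{
  int pv;

  pv = *p + 1;
  if (pv <= 10)
    goto block8;
  *p = pv;                         /* write-back before g () and h () */
  g ();
  h ();
  if (x <= 5)
    goto block4;
  g ();
block4:
  if (x > 3)
    goto block6;
  x = *p;                          /* the only read of the real location */
  goto block7;
block6:
  i ();
block7:
  pv = 5;
block8:
  *p = pv;                         /* final write-back */
  return pv;
}

/* Run both versions from the same starting state and compare the return
   value, the final contents of *p, and the final value of x.  */
static int
same_behaviour (int start, int x0)
{
  int a = start, b = start, ra, rb, xa, xb;

  x = x0; alias = &a; ra = foo_orig (&a);  xa = x;
  x = x0; alias = &b; rb = foo_fig11 (&b); xb = x;
  return ra == rb && a == b && xa == xb;
}
```

For example, same_behaviour (20, 0) exercises the path through block5, where x receives the value that g () stored into *p, and same_behaviour (20, 7) exercises the path through block3 and block6.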
Figure 12. Original SSA-conversion process 

Inputs are: function: a function to be SSA converted 

[201] Insert phi-functions 

[202] Replace source-variables with SSA variables 
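
The renaming step of Figure 12 ('Replace source-variables with SSA variables') can be illustrated for the simplest case, a single block with no control-flow merges, where no phi-functions are needed. The following is a minimal C sketch under that assumption; the types insn and ssa_insn, the function rename_block, and the limit NVARS are hypothetical names introduced for the example and do not appear in the specification.

```c
#include <assert.h>

#define NVARS 4   /* hypothetical limit on distinct source variables */

/* A source instruction "dst = src1 op src2"; fields are variable indices.
   The operator is omitted because renaming only looks at names.  */
struct insn { int dst, src1, src2; };

/* The same instruction after renaming; fields are SSA version numbers.
   Version 0 denotes the value a variable holds on entry to the block.  */
struct ssa_insn { int dst_ver, src1_ver, src2_ver; };

static void
rename_block (const struct insn *in, struct ssa_insn *out, int n)
{
  int cur[NVARS] = { 0 };          /* current version of each variable */

  for (int k = 0; k < n; k++)
    {
      /* Uses are renamed first, so "a = a + b" reads the old version of a.  */
      out[k].src1_ver = cur[in[k].src1];
      out[k].src2_ver = cur[in[k].src2];
      /* Every definition creates a new, unique version.  */
      out[k].dst_ver = ++cur[in[k].dst];
    }
}
```

With variable 0 standing for a and variable 1 for b, the sequence a = a + b; b = a + a; a = b + a is renamed to a1 = a0 + b0; b1 = a1 + a1; a2 = b1 + a1. Handling several blocks additionally requires the phi-functions of step [201] wherever definitions merge.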
1. Abstract 

Subject 

The usual formulation of the compiler representation known as "SSA-form" can only 
handle local variables. It is desirable to extend this to allow other locations to be 
represented. 

Means for solution 

This invention adds synchronization operations that allow the efficient use of SSA-form 
for non-local memory locations in the presence of possible aliasing. 

2. Representative Drawing 

Figure 1 
Japanese Patent Application No. 2000-119594 (Certification No. 2001-3007556) 


